The relative proportion of comorbidities among rhinitis and rhinosinusitis patients and their impact on visit burden

Abstract

Background: The aim was to evaluate the relative proportion of Non-steroidal anti-inflammatory drug exacerbated respiratory disease (NERD) and other comorbidities, and their impact on the burden of outpatient visits due to allergic rhinitis (AR), non-allergic rhinitis (NAR), acute rhinosinusitis (ARS), and chronic rhinosinusitis with nasal polyps (CRSwNP) and without (CRSsNP).

Methods: We used hospital registry data of a random sample of 5080 rhinitis/rhinosinusitis patients diagnosed during 2005–2019. International Statistical Classification of Diseases and Related Health Problems (ICD-10) diagnoses, visits, and other factors were collected from electronic health records by using information extraction and data processing methods. Cox's proportional hazards model was used for modeling the time to the next outpatient visit.

Results: The mean (±standard deviation) age of the population was 33.6 (±20.7) years and 56.1% were female. The relative proportions of AR, NAR, ARS, CRSsNP, and CRSwNP were 33.5%, 27.5%, 27.2%, 20.7%, and 10.9%, respectively. The most common other comorbidities were asthma (44.4%), other chronic respiratory diseases (38.5%), musculoskeletal diseases (38.4%), and cardiovascular diseases (35.7%). Non-steroidal anti-inflammatory drug exacerbated respiratory disease existed in 3.9% of all patients, and in 17.7% of the CRSwNP group. The relative proportions of subjects having 1, 2, 3, and ≥ 4 other diseases were 18.0%, 17.6%, 17.0%, and 37.0%, respectively. All diseases except AR, ARS, and mouth breathing were associated with a high frequency of outpatient visits.

Conclusions: Our results revealed a high relative proportion of NERD and other comorbidities, which affect the burden of outpatient visits and hence confirm the socioeconomic impact of upper airway diseases.


and with nasal polyps (CRSwNP) about 1%-4%. 4 Risk factors of these diseases include asthma, other allergic diseases, NERD, and smoking, in addition to genetic predisposition and host-environmental (-microbial) interactions. About 10% have severe disease, of which 70% have type 2 (eosinophilic) inflammation, CRSwNP, asthma/allergic multimorbidity, and/or NERD, whereas the remaining uncontrolled cases have variable risk factors. [7][8][9] The proportion of NERD has been shown to be about 16% among hospital CRSwNP patients. 10 Multiple chronic conditions have been shown to increase the frequency of physician visits. 11 We have previously shown that patients with at least one chronic disease have an increased risk of severe asthma. 9 We are not aware of previous literature on the overlap of diagnoses, comorbidities, and burden of outpatient visits due to rhinitis/rhinosinusitis.
This study was carried out to evaluate the relative proportion of NERD and other comorbidities and their impact on the burden of outpatient visits due to AR/NAR/ARS/CRS. Although inflammatory upper airway diseases have been shown to have a significant socioeconomic impact, their outpatient visit burden has been scarcely studied.

| Patients
This retrospective registry-based follow-up study on rhinitis or rhinosinusitis patients was carried out at the Departments of Allergy

The patient variables for the study were collected and processed both from the structured and coded EHR data (cf. visits, procedure, and diagnosis codes) and from the free text of the hospital charts.

• Allergy (n = 1): J45.0, or J30., or EHR "allergy" (see Table E1 in the [Additional file 1])
• Immunodeficiency or suspicion of immunodeficiency (n = 1): B20, or D80-84, or EHR "immunodeficiency"

| The collected variables
The data extraction was performed by searching the diagnoses in the visit data and diagnostic data (see Table E2 in the [Additional file 1]). In addition, the patient chart texts were searched directly for the diagnostic code or terms referring to the disease in specified words (see Table E2 in the [Additional file 1]).
Allergic rhinitis diagnosis in the EHR was based on a positive skin prick test or serum-specific immunoglobulin E (IgE) results, in addition to typical symptoms. Non-allergic rhinitis diagnosis was based on typical symptoms that are not connected to known allergens and/or a lack of positive skin prick test or serum-specific IgE results for known allergens that could be related to the symptoms during that season. CRS and CRSwNP were diagnosed according to the European Position Paper on Rhinosinusitis and Nasal Polyps. 4 Doctor-diagnosed asthma means that asthma medication is reimbursed by the Social Insurance Institution of Finland. For this, the asthma diagnosis is based on a typical history and asthma symptoms together with lung function findings (spirometry and peak expiratory flow (PEF)): at least 15% improvement in the bronchodilator test in spirometry (in forced expiratory volume in one second (FEV1) or forced vital capacity (FVC)), and/or recurrent 20% diurnal variation in PEF monitoring, or recurrent 15% bronchodilator response in PEF monitoring, or a positive methacholine challenge test (moderate to severe bronchial hyperresponsiveness), or a response to inhaled corticosteroid treatment confirmed by a lung function test. 12 Non-steroidal anti-inflammatory drug exacerbated respiratory disease diagnosis was based on a positive patient history of wheeze/cough or naso-ocular symptoms after intake of an NSAID, or additionally on a positive reaction (wheeze and/or naso-ocular reaction) in an acetylsalicylic acid (ASA) provocation test at the hospital. 13

| Information extraction from electronic health records
The information extraction from the medical reports was based on two separate methods. In the first method, we searched for International Statistical Classification of Diseases and Related Health Problems (ICD-10) codes directly in the clinical chart texts. 14 If any code related to a particular disease was found, then the patient's disease variable was given the value "True". If a patient had multiple diagnoses for different diseases, then the patient received a True value for each disease. For example, if the patient had the codes J33 and J31, then the patient received True in both groups.
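The first method can be illustrated with a minimal sketch. It assumes a regular-expression search over the chart text; the pattern, the function name `flag_diseases`, and the `DISEASE_CODES` mapping are illustrative assumptions rather than the study's implementation. The code groups follow the J30/J31/J32/J33/J01 grouping reported in this paper.

```python
import re

# Assumed mapping from disease variable to ICD-10 three-character categories,
# following the grouping reported in the paper (J30 AR, J31 NAR, J32 CRSsNP,
# J33 CRSwNP, J01 ARS).
DISEASE_CODES = {
    "AR": ("J30",),
    "NAR": ("J31",),
    "CRSsNP": ("J32",),
    "CRSwNP": ("J33",),
    "ARS": ("J01",),
}

# Illustrative pattern: a letter, two digits, and an optional ".x" or ".xx" suffix.
ICD10_PATTERN = re.compile(r"\b([A-Z]\d{2})(?:\.(\w{1,2}))?\b")


def flag_diseases(chart_text):
    """Return a True/False flag per disease based on ICD-10 codes found in the text."""
    # Collect the three-character category of every code-like token.
    found = {m.group(1) for m in ICD10_PATTERN.finditer(chart_text.upper())}
    return {
        disease: any(code in found for code in codes)
        for disease, codes in DISEASE_CODES.items()
    }


flags = flag_diseases("Dg: J33 nasal polyps; also J31.0 chronic rhinitis")
print(flags)
```

A patient whose chart contains both J33 and J31 is flagged True for both CRSwNP and NAR, which matches the rule described above.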
In the second method, we searched for keywords related to the basic diseases (such as "diabetes", "NERD"). 15 present example steps of the information extraction for the case of diabetes and NERD. In this example, the keywords of diabetes (translated to English) were: 'diabetes', 'sugar', 'blood sugar', and 'insulin'. The keywords for NERD were: 'aerd', 'samter', 'aspirin', and 'asa'. When a keyword was found from the clinical text, rule-based inference identified cases that were related to negation, family history, or good medical status (Column "Rule-based dictionary" in
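The second method can be sketched in the same spirit. The keyword lists below are the ones given in the text. The exclusion cues, the context window, and the regex tokenizer (standing in for nltk tokenization) are hypothetical simplifications, since the study's rule-based dictionary is not reproduced here. The sketch also ignores sentence boundaries and the "good medical status" rule.

```python
import re

# Keyword lists as given in the text (translated to English).
KEYWORDS = {
    "diabetes": ["diabetes", "sugar", "blood sugar", "insulin"],
    "NERD": ["aerd", "samter", "aspirin", "asa"],
}

# Hypothetical exclusion cues; the study's actual dictionary may differ.
EXCLUSION_CUES = {
    "negation": ["no", "not", "denies", "without"],
    "family_history": ["mother", "father", "sibling", "family"],
}
WINDOW = 4  # assumed number of tokens inspected before a keyword


def tokenize(text):
    """Lowercase word tokenizer (a simple stand-in for nltk tokenization)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def mentions(text, disease):
    """True if a keyword for the disease occurs without an exclusion cue before it."""
    tokens = tokenize(text)
    for keyword in KEYWORDS[disease]:
        kw = keyword.split()
        for i in range(len(tokens) - len(kw) + 1):
            if tokens[i:i + len(kw)] != kw:
                continue
            context = tokens[max(0, i - WINDOW):i]
            excluded = any(
                cue in context
                for cues in EXCLUSION_CUES.values()
                for cue in cues
            )
            if not excluded:
                return True
    return False


print(mentions("Patient has asthma and aspirin intolerance", "NERD"))
print(mentions("No diabetes", "diabetes"))
```

A fixed token window is a crude choice. It keeps the example short, but a production rule set would normally respect sentence boundaries and handle cues that follow the keyword.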

| Data analysis
We used Python packages nltk, 17 scipy, 18 numpy, 19 pandas, 20 and matplotlib-venn 21 to implement the data processing, information extraction from clinical text and all statistical analysis. We used R packages survival 22 and glmnet 23 to model the time to the next visit.
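The LASSO penalty λ in this analysis was tuned with glmnet in R. As a language-neutral illustration, the sketch below shows the one-standard-error selection rule in plain Python. It assumes that the rule used corresponds to glmnet's `lambda.1se` convention (the largest λ whose cross-validated error lies within one standard error of the minimum). The function name `lambda_1se` and the example numbers are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def lambda_1se(lambdas, fold_errors):
    """Pick the largest lambda whose mean CV error is within 1 SE of the minimum.

    lambdas: candidate penalty values.
    fold_errors: for each lambda, the list of per-fold cross-validation errors.
    """
    stats = []
    for lam, errs in zip(lambdas, fold_errors):
        se = stdev(errs) / sqrt(len(errs))  # standard error of the mean CV error
        stats.append((lam, mean(errs), se))
    # Lambda with the minimum mean cross-validated error.
    _, err_min, se_min = min(stats, key=lambda s: s[1])
    threshold = err_min + se_min
    # The most regularized model that is still within one SE of the best.
    return max(lam for lam, err, _ in stats if err <= threshold)


# Hypothetical cross-validation errors for three candidate penalties.
candidates = [0.01, 0.1, 1.0]
errors = [[1.0, 1.2, 1.1], [1.05, 1.25, 1.15], [2.0, 2.2, 2.1]]
print(lambda_1se(candidates, errors))
```

Choosing the larger penalty trades a small loss in fit for a sparser set of predictors, which suits an exploratory search for the strongest predictors of the next visit.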
Word tokenization for the keyword search was done with the tokenization functions of the package nltk. Statistical tests were done using the module "stats" in the package scipy. The packages numpy and pandas were used for data reading and processing. Venn diagrams were drawn using the function "venn3" in the package matplotlib-venn. We used the function coxph from the package survival for training Cox's proportional hazards models for modeling time to the next visit. The number of previous visits and background variables were used as predictors. The package glmnet was used for training the Least Absolute Shrinkage and Selection Operator (LASSO) model for exploring the best predictors for the hazard of the next visit. [24][25][26] The parameter λ of LASSO was searched for by cross-validation and 1 standard error from the minimum λ value was used. 23

Table E3 in the [Additional file 1] presents the characteristics of all patients. The mean age of the patients was 33.6 ± 20.7 years, and 56.1% were female. The follow-up times did not differ between the groups (data not shown). The mean follow-up time was 8.6 years in adults and 8.0 years in children (<18 years). The difference was statistically significant (p < 0.001). The relative proportions of diagnoses J30., J31., J32., J33, J01., reflecting patients with AR, NAR, CRSsNP, CRSwNP, and ARS, were 33.5%, 27.5%, 20.7%, 10.9%, and 27.2%, respectively (Table E3). Table E4 (Table E3). Table 1 presents the characteristics of the rhinosinusitis subgroups. The relative proportions of CRSsNP, CRSwNP, ARS, RARS, any CRS with acute exacerbation (AE), and CRSwNP AE were 17.8%, 10.9%, 14.9%, 3.5%, 5.1%, and 1.4%, respectively (Table 1). Table E5 in the [Additional file 1] presents cross-tabulation of any CRS,

NUUTINEN ET AL.

We showed a high overlap of upper airway diagnoses among rhinitis/rhinosinusitis patients (Table E3, Table 1, Figure 1 Venn-diagrams).
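The overlap shown in a three-set Venn diagram comes down to seven region counts. The sketch below computes them from sets of patient identifiers. The function name `overlap_regions` and the example identifiers are hypothetical, and the study's own figures were drawn with matplotlib-venn rather than with this code.

```python
def overlap_regions(ar, nar, crs):
    """Count patients in each of the seven regions of a three-set Venn diagram."""
    return {
        "AR only": len(ar - nar - crs),
        "NAR only": len(nar - ar - crs),
        "CRS only": len(crs - ar - nar),
        "AR+NAR": len((ar & nar) - crs),
        "AR+CRS": len((ar & crs) - nar),
        "NAR+CRS": len((nar & crs) - ar),
        "AR+NAR+CRS": len(ar & nar & crs),
    }


# Hypothetical patient identifiers per diagnosis group.
ar_patients = {1, 2, 3, 4}
nar_patients = {3, 4, 5}
crs_patients = {4, 5, 6, 7}

regions = overlap_regions(ar_patients, nar_patients, crs_patients)
print(regions)
```

The seven counts sum to the number of distinct patients with at least one of the three diagnoses, which is a convenient sanity check before plotting.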

| The characteristics of patients with rhinitis/ rhinosinusitis
At least one comorbidity other than rhinitis/rhinosinusitis was detected in 89.6% of cases. Comorbid chronic respiratory diseases (other than asthma) were more frequent among NAR and AR patients than among CRS patients.

| The frequency of outpatient visits for rhinitis/ rhinosinusitis
The mean (±standard deviation) follow-up time for the patients in our study was 8.5 ± 3.4 years (Table E3). With the LASSO model, we found that the visit frequency risk increased with the number of upper airway diseases; as compared to 1 disease, adjusted HR (coef) was 1.099 (0.09).
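For readers less familiar with Cox models, the link between a reported coefficient and its hazard ratio can be written out. The notation below (baseline hazard h_0, coefficient vector β, predictor vector x) is assumed here for illustration and does not come from the source.

```latex
h(t \mid x) = h_0(t)\,\exp\!\left(\beta^{\top} x\right),
\qquad
\mathrm{HR}_j = \exp(\beta_j),
\qquad
\ln(1.099) \approx 0.094
```

The reported HR of 1.099 therefore corresponds to a coefficient of about 0.094, which agrees with the reported value of 0.09 after rounding.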

| DISCUSSION
We found a strong overlap of upper respiratory diseases. The most common comorbidities were other chronic respiratory diseases but also musculoskeletal and cardiovascular diseases. Comorbidities were associated with a high outpatient visit burden.
We detected that AR/NAR/ARS/CRS diagnoses co-existed in about a fifth of the present cases. Previous studies have confirmed the overlap of these conditions, 2-4 although they differ in etiopathology, risk factors, and clinical picture.
We showed here that more than a third of rhinitis/rhinosinusitis

We showed that all diseases except three (AR, acute rhinosinusitis, and mouth breathing) were associated with a high visit burden.
The risk of visit burden increased with the number of inflammatory upper airway diseases. This result could help in patient counseling and planning of treatment processes. We have previously shown that patients with at least one chronic disease have an increased risk of severe asthma. 9 Multiple chronic conditions have been shown to increase the frequency of physician visits. 11 Also, AR in primary care 40 and pediatric acute rhinosinusitis in hospital care 41 have been shown to increase visit burden. Comorbid CRS has been shown to increase asthma-related emergency visits. 42

The data showed that extracting the selected variables from EHR data worked well in this type of study. The accuracy of EHR data extraction has previously been shown, for example, in joint implant registries. 43 A limitation of the EHR extraction method is that physicians may make EHR entries in different ways, or some information may not be found in the EHR at all. This source of bias was minimized by extracting data from a random sample of patients treated by different physicians over a long follow-up period.
The strengths of this study include a large and random sample of patients with outpatient visits and the use of text mining of EHR texts, in addition to coded diagnoses. We showed that information extraction from the EHR performs well in finding NERD patients as well as non-respiratory comorbidities of patients with rhinologic diseases. The retrospective design, the selected hospital population, and potentially inadequate data extraction due to insufficient coding put some limitations on the study. Information about visits elsewhere, such as to general practitioners, occupational healthcare, or the private sector, was not available. We acknowledge that a control group and data on symptom scores, medications, polyp scores, Lund-Mackay scores, etc. were not available in this study. The role of sinus surgery has been analyzed elsewhere. 44 Relative proportion does not fully correspond to prevalence, which may explain the different results compared to general population studies. Some diagnoses, such as first J31 and later J30, may have been used in the same patient before and after allergy test results.
The physician sometimes enters only one diagnosis (J30) in cases of mixed rhinitis (J30 & J31), so in real life, the proportion of co-existing J30 and J31 diagnoses is likely to be higher. Local allergic rhinitis (LAR) is also caused by an IgE-mediated reaction, but due to the lack of validated diagnostic tests, NAR diagnoses might also include LAR cases. Hence the findings need validation in other populations. In Finland, excellent EHRs also exist in primary healthcare, the private sector, and occupational health, and a similar analysis in these populations would provide valuable information about the overall disease burden.